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Abstract —Robust Principal Component Analysis (RPCA) via rank minimization is a powerful tool for recovering underlying low-rank structure 
of clean data corrupted with sparse noise/outliers. In many low-level vision problems, not only it is known that the underlying structure of 
clean data is low-rank, but the exact rank of clean data is also known. Yet, when applying conventional rank minimization for those problems, 
the objective function is formulated in a way that does not fully utilize a priori target rank information about the problems. This observation 
motivates us to investigate whether there is a better alternative solution when using rank minimization. In this paper, instead of minimizing 
the nuclear norm, we propose to minimize the partial sum of singular values, which implicitly encourages the target rank constraint. Our 
experimental analyses show that, when the number of samples is deficient, our approach leads to a higher success rate than conventional 
rank minimization, while the solutions obtained by the two approaches are almost identical when the number of samples is more than sufficient. 
We apply our approach to various low-level vision problems, e.g. high dynamic range imaging, motion edge detection, photometric stereo, 
image alignment and recovery, and show that our results outperform those obtained by the conventional nuclear norm rank minimization 
method. 

Index Terms —Robust principal component analysis, rank minimization, sparse and low-rank decomposition, truncated nuclear norm, 
alternating direction method of multipliers. 
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1 Introduction 

V Arious low-level vision applications, including High 
Dynamic Range (HDR) [35], [36], photometric 

stereo [3], [23], batch image alignment [38] and 

factorization-based structure from motion [5], [41], 

can be formulated as a low-rank matrix recovery problem. 
Low-rank matrix approximation methods, such as Principal 
Component Analysis (PCA) [29] and matrix factorization 
[7], [18], [40], [48] are widely used to find the best 
approximation of an underlying low-rank structure of 
data. However many of these approaches are error-prone 
due to the presence of outliers. To recover the low-rank 
matrix while rejecting outliers, a rank minimization based 
Robust Principal Component Analysis (RPCA) [9] has 
been proposed and gained much interest in computer 
Hsion [28], [38], [45], [47]. 

RPCA [9] aims to recover a low-rank matrix A G 
from corrupted observations O = A + E, where E G 
represents errors with arbitrary magnitude and distribution. 
The rank minimization approach [9], [10], [39], [43] assumes 
E is sparse and formulates the problem as: 


min rank(A) + A||E||o, s.t. O = A + E, (1) 

A,E 

where H-Ho denotes the /^-norm, and A is the relative weight 
between the two terms. Unfortunately, solving the problem 
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in Eq. (1) is an NP-hard problem. The rank minimization 
approach instead solves an approximated problem by con¬ 
vex surrogate as: 

min IIAll* + A||E||i, s.t. O = A + E, (2) 

A,E 

where ||A||* = nuclear norm of A, cFi{A) 

represents the i-th singular value of A (sorted in decreasing 
order), and ||E||i is the /^-norm of E. Eq. (2) can be solved 
effectively by various methods [31], [44]. Wright et al. [43] 
and Candes et al. [9] proved that, under mild conditions, the 
unique solution of Eq. (2) exactly corresponds to the solution 
of the original NP-hard problem in Eq. (1). Yet, when the 
number of observations in O is very limited, experiments 
show that the converged solution includes some outliers 
as inliers and vice versa. Such limited number of observa¬ 
tions is not uncommon in many computer vision problems 
due to practical reasons. Eor example, in HDR context, 
often only 2-4 differently exposed images are captured 
and photometric stereo requires only 3 images in theory. 
Moreover, we also observe that the converged solution can 
be degenerated. Eor instance, in the photometric stereo 
problem [45], the solution of A might have a rank lower 
than the theoretical rank of 3. 

In this paper, based on the prior knowledge about the 
rank of A, we propose an alternative objective function 
which minimizes the Partial Sum of Singular Values (ab¬ 
breviated to PSSV) of A: 

min ||A||p=Ar + A||E||i, s.t. O = A + E, (3) 

A,E 

where ||A||p = cr^(A) and N is the target rank of 

A which can be derived from the problem definition, e.g. 
N = 1 for background subtraction, N = 3 for photometric 
stereo. Eq. (3) minimizes the rank of residual errors of A 
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against the target rank, instead of the nuclear norm. A 
major drawback of using the nuclear norm to approximate 
the /^-norm of singular values is that the nuclear norm 
minimizes not only the rank of A, but also the variance 
of A by simultaneously minimizing all the singular values 
of A including the singular values within the target rank 
A, i.e. Consequently, the low-rank structure of A 

may not be well approximated under the environment that 
does not follow the assumption of large number of inputs. 

Although Eq. (3) is non-convex, we observe in our experi¬ 
ments that Eq. (3) encourages the resulting low-rank matrix 
to have a rank close to N even with deficient observations. 
Eor example, when the singular values of A within the 
target rank are small, the nuclear norm can result in a rank 
deficient matrix A, i.e. whose rank is smaller than the target 
rank. In contrast, because our work does not minimize the 
subspace variance of A within the target rank, we are not 
biased to the solution with smaller variance of A. Thus, the 
low-rank matrix A can be more accurately estimated. Eur- 
ther analyses and discussions about this claim are provided 
in the later sections of our paper. 

In contrast to matrix factorization methods where the 
a priori rank constraint is enforced as a hard constraint 
via matrix projection, we encourage the rank constraint as 
a soft constraint and propose the Partial Singular Value 
Thresholding (PSVT) to solve our partial sum singular 
value objective function. As analyzed in our study, the 
PSVT operator encourages the result A to meet the target 
rank even when all the singular values are small. 

This work is the extension of our previous conference 
paper [33]. We empirically study the proposed objective 
function in many low-level vision problems, e.g. HDR 
imaging, motion boundary detection, photometric stereo, 
image alignment, and image recovery where the theoretical 
rank of A is known and the number of observations is 
limited (except the image recovery application). Our exper¬ 
imental analyses show that our formulation, described in 
Eq. (3), converges to a solution more robust to outliers than 
the solution obtained by the objective function in Eq. (2) 
in rank minimization, when the number of observations 
is limited. Empirically, we also find that the solutions of 
Eq. (2) (nuclear norm) and Eq. (3) (our PSSV) are almost 
identical when more than a sufficient number of samples is 
observed. 

In short summary, our contributions are as follows: 

• We present a partial sum objective function and its 
corresponding minimization method for RPCA. 

• We empirically study the partial sum objective function 
and claim that it outperforms the nuclear norm rank 
minimization when the number of observed samples 
is very limited. 

• We present the convergence property of the proposed 
algorithm to minimize the proposed partial sum objec¬ 
tive function consisting of PSSV and sparse term, and 
provide its proof. 

• We apply our technique on various low-level vision 
problems and demonstrate superior results over previ¬ 
ous works. 


2 Related Works 

In this section, we briefly review early works related to 
RPCA, then we discuss some recent advances in RPCA 
and its applications in computer vision. We will also review 
some recent matrix factorization based works for low-rank 
approximation. We invite readers to refer to Candes et al. [9] 
for a thoughtful review of RPCA. 

In conventional PC A [29], the goodness-of-fit of data 
is evaluated by the /^-norm which is very sensitive to 
outliers. Early works in RPCA tried to reduce the effects of 
outliers by random sampling [19], robust M-estimator [12], 
[13], or alternating minimization [30] to identify outliers 
or penalize data with large errors. These methods share 
some limitations: either they are sensitive to the choice of 
parameters or their algorithms are not scalable enough in 
running time. 

Recent advances in RPCA showed that the heuristic 
nuclear norm solution [9], [39], [43] converges to a solution 
which is robust to sparse outliers. Candes et al. [9] proved 
that the original RPCA problem as in Eq. (1) can be solved 
by instead solving the convex relaxation version in Eq. (2), 
and it provides a unique and exact solution of Eq. (1) as long 
as E is sparse and random and the underlying rank(A) is 
lower than a certain upper boundh To solve Eq. (2), vari¬ 
ous methods have been proposed [31], [44]. Among them. 
Alternating Direction Method of Multipliers (ADMM, or 
also called inexact augmented Lagrange multiplier) [31] has 
shown to be computationally efficient. Also, Zhou et al. [49] 
and Agarwal et al. [1] proved that convex approximation by 
nuclear norm can still achieve bounded and stable results 
even under small noise measurements. 

Besides the standard nuclear norm relaxation, some 
works study variants of the nuclear norm to enhance 
performance of rank minimization [11], [20], [24]. Chen et 
al. [11] and Gaiffas et al. [20] proposed an adaptive weighted 
nuclear norm. They suggested a non-trivial update scheme 
to update the adaptive weighted nuclear norm and claimed 
to achieve higher accuracy in low-rank matrix approxima¬ 
tion in comparison with the standard nuclear norm. Hu et 
al. [24] proposed the truncated nuclear norm (TNN) for the 
matrix completion problem which shares a similar objective 
function with our PSSV objective function. Since the TNN 
is non-convex (which is not easy to directly solve), they aim 
to avoid direct minimization by locally approximating TNN 
as minx,w||-V||>, — Tt{AiWBJ ), s.t. X = W hy alternatively 
minimizing Ai^Bi^X and W based on the singular value 
thresholding (SVT) operator [8]. This alternating scheme 
requires outer iterations and additional SVD computations. 
Instead of utilizing this alternating scheme, we propose 
the PSVT operator to directly minimize the partial sum 
of singular value term. Although our objective function is 
also non-convex, our proposed PSVT produces the closed- 
form solution to the sum of the PSSV and proximity term. 
In that sense, every sub-problem of our problem has a 
closed-form solution. Thus, our optimization problem can 
be solved efficiently. Moreover, while the approach of Hu 
et al. is dedicated only to matrix completion, we show that 
our work can be successfully applied for several computer 

1. The bound depends on the matrix size. 
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vision tasks spanning from image alignment to photometric 
stereo and HDR imaging. 

Another branch of low-rank framework is based on Ma¬ 
trix Factorization (MF). Several robust MF methods based 
on /^-norm have been suggested [7], [18], [48]. A benefit 
of matrix factorization approaches is that they can easily 
enforce the rank constraint by the explicit bilinear matrix 
form. The target rank constraint is enforced as a hard 
constraint via matrix reprojection or orthogonal procrustes. 
Cabral et al. [7] revisited the relationship between nuclear 
norm regularization and bilinear MF model [2], and pro¬ 
posed a rank continuation heuristic to avoid local minima. 
Compared with MFs, our target rank constraint is expressed 
as a soft constraint which provides flexibility to balance 
between the rank constraint and other constraints used in 
different computer vision problems. 

The robustness and scalability of the rank minimization 
algorithm for RPC A [9], [31], [44] have inspired many 
applications in computer vision, such as background sub¬ 
traction [9], image and video restoration [28], image align¬ 
ment [38], regular texture analysis [47], and robust pho¬ 
tometric stereo [45]. These applications are based on the 
observation that the underlying structures of clean data are 
linearly correlated, which forms a low-rank data matrix. 
The rank minimization proposed by Candes et al. [9] is 
general in the sense that it does not require to know a priori 
the rank of clean data. However, as briefly mentioned in 
the introduction, in some applications, the rank of clean 
data can be determined by the problem definition, and this 
motivates us to investigate how the prior rank information 
can be fully utilized in the context of RPC A. 

The success of rank minimization based RPCA comes 
from the blessing of dimensionality of input matrix [16], 
[43], implying large amount of observations. However, 
when the number of observations is limited, which is com¬ 
mon in practice, results from RPCA might be degenerated, 
e.g. correct samples might be considered as outliers and 
vice versa. As discussed in the introduction, this happens 
because the standard nuclear norm minimizes not only the 
rank of the matrix, but also the variance of data distribution 
of the matrix. To overcome this limitation, we introduce an 
alternative objective function that can efficiently deal with 
a deficient number of samples in the rank minimization 
problem. Our work can be considered as an addendum 
to the standard rank minimization approach when the 
target rank or the approximate target rank is known. The 
proposed alternative objective function can control the rank 
with a simple and efficient minimizer. We demonstrate the 
effectiveness of our proposed objective function through 
thoughtful experiments. 


3 Partial Sum Minimization by the PSVT 
Operator 

3.1 Derivation of Partiai Sum of Singuiar Vaiues 

Our partial sum formulation in Eq. (3) is originated from 
the following objective function: 

argmin |rank(A) — A| + A||E||o, s.t. O = A + E. (4) 

A,E 


Eq. (4) aims to recover a low-rank matrix A close to the 
target rank N and a sparse error matrix E. 

Since the above objective function is also an NP-hard 
problem, we relax it with an alternative representation in 
order to effectively deal with it. The relaxation is similar 
to the method presented by Candes et al. [9]. We should 
also properly interpret the target rank N. We relax it with 
a projection operator to enforce a rank-A matrix in a 
matrix interpretation manner. From the relaxation, the PSSV 
objective function, which is the first term in Eq. (4), can be 
derived as follows: 


|||A||*-||PAr(A)|U| = 


min(m,n) N 

^ cri(A) - ^C7i(A) 
i=l i=l 

min(m,n) 

^ crj(A) = ||A||p=Ar, 

i=JV+l 


(5) 


where || • ||p=Ar denotes the PSSV with the target rank A, 
and Pr{') is the matrix projection operator to rank-r matrix 
defined as follows. 

Definition 1. [Projection operator] 

P,(X) = U^.XVi.,, (6) 


where Ui:^ and Vi:^ are the matrices consisting of the 
singular vectors corresponding to the r largest singular 
values of X. 


3.1.1 From rank constraint to projection 
Eq. (5) leads us to the PSSV objective function in Eq. (3). In 
this section, we show the relationship between the target 
rank A and the projection operator in Eq. (5). We first 
introduce a rank representation. 

Lemma 1. Let A G and rank(A) > r, then there exist 

matrices C G and B G such that CC^ = = 

I G and 


rank(CAB) = r. (7) 

Proof. Let UDV^ be SVD of A. Suppose C = and B = 
Vi:r, where Ui:^ and Vi:^ are the matrices consisting of 
the singular vectors corresponding to the r largest singular 
values. C and B satisfy rank (CAB) = r, which concludes 
the proof. □ 


The constant r can be represented in the matrix form 
with Lemma 1. Now, we show the characteristics of the 
presented solution by SVD in Lemma 1 with Lemma 2. 


Lemma 2. For any u = {w|wTspan{ui, • • • ,u/c_i}}, v = 
{w|wTspan{vi, • • • , v/c_i}} and A G 


e^k 


= max 

u,v 


|u^Av| 


( 8 ) 


Lemma 2 is the well-known Variational Characterization 
of Singular Values (or Courant-Fischer Min-max principle for 
singular values). By Lemma 2, we see that C and B satisfying 
Lemma 1 are also the unique solution of the following 
problem: 

max ||CAB||* s.t. CC"^=B'^B = I. 

C ,B 


( 9 ) 
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Fig. 1: Illustrations of the potential problems in minimizing nuclear norm (all 
singular values). The ground truth subspace (green) is a 1D line corrupted 
with sparse outliers and noise. In (a), the estimated subspace is biased to 
the estimated axis that has a smaller nuclear norm but a second singular 
value larger than the ground truth coordinate. In (b), some inliers located 
on the ground truth sub-space are regarded as outliers to achieve a smaller 
singular value. 




(a) (b) 

Fig. 2: A toy example comparison between the nuclear norm solution and 
the PSSV solution. Y-axis represents the magnitude of the nuclear norm 
and PSSV. The red dots represent the minimum points of the magnitude. 
The graphs show the nuclear norm and the PSSV of 2 x 2 matrices A = 
[ll]3x] in (a) and [11; lx] in (b) with x varying from 0 to 4. In (a), the 
minimum point of nuclear norm is at x = 1 where the singular values of A 
are equal to [3.4142,0.5858] (i.e., rank-2). As for the PSSV, the minimum 
point is at X = 3 with singular values equal to [4.4721,0.0000] (i.e., rank- 
1). In this toy example, the nuclear norm favors a rank-2 solution over a 
rank-1 solution because the rank-2 solution provides the minimum nuclear 
norm. In contrast, the PSSV achieves the lowest rank (rank-1) solution. In 
(b), when the basis of the first row of A is partially supported by another 
sample (second row), the nuclear norm and the PSSV both achieve the 
rank-1 solution at minima. 

Though the solution satisfying Lemma 1 is not unique, 
the solution of Eq. (9) can be a way to represent the target 
rank constraint. Therefore, we relax the target rank constant 
to the nuclear norm representation in Eq. (9) with Ui:^ and 
Vi:^ which satisfy both Lemma 1 and Eq. (9). In summary, 
we show the first term in Eq. (4) can be relaxed as PSSV 
by Lemmas 1 and 2, and the introduced projection operator 
Pr{') (see Definition 1) favors preserving the information of 
the r largest singular values. Intuitively, when rank(A) > r, 
as cri>Ar(A) ^ 0 (namely minimized), rank(A) ^ N. Of 
course, if rank(O) < A, i.e. if the inputs are degenerated, 
the rank of A cannot be increased to meet the target 
rank. This is a fundamental limitation. In common cases 
where input data contains noise or outliers, the inequality 
condition rank(A) > r is easily satisfied, because data 
corruptions increase the rank of input data. 


degrade the accuracy of the estimated low-rank subspace. 
The bias phenomenon by a convex surrogate is common, 
and it could be corrected by non-convex relaxation [46]. 
An additional issue is that if the ground truth has a large 
variance but a sparse distribution within the ground truth 
subspace, some inliers can be regarded as outliers in order 
to reduce the singular values within the target rank, as 
illustrated in Eig. l-(b) and at the minimum point of nuclear 
norm in Eig. 2-(a). These two problems are not an issue 
when there is a lot of observed data to support the correct 
estimation of A. However, when observed data is limited, 
minimizing nuclear norm can easily cause bias since there 
is not a sufficient number of truth samples to support large 
variance of A within the target rank. In contrast, the PSSV 
does not assume small variance of A, and it only minimizes 
variances in residual rank which corresponds to minimizing 
the noise variance of observed data. Note that the original 
rank operator, rank(A), in Eq. (1) does not favor small 
variance solution. 

3.2 Optimization by ADMM 

Our partial sum objective function in Eq. (3) forms a 
constrained optimization problem. To solve this type of 
problems, Lin et al. [31] proposed an ADMM method (or 
called inexact augmented Lagrange multipliers, iALM). The 
augmented Lagrangian function of Eq. (3) is formulated by: 

L^(A,E,Z) = ||A||p=,v + A||E||i (10) 

+{Z,0-A-E) + |||0-A-E||^, 

where /i is a positive scalar, Z G is an estimate of 

the Lagrange multiplier, || • ||f denotes the Erobenius norm, 
and (•,•) represents the inner product operator. Directly 
minimizing the Lagrangian function might be particularly 
challenging. According to a recent development of alter¬ 
nating direction method [31], Eq. (10) can be solved by 
minimizing each variable alternatively while fixing the 
other variables. The optimization problem can be divided 
into two sub-problems: 

A* =argininL^j,(A,Efc,Zfc) 

A 

1 2 
= argmin/x^^||A||p=iv + -llA - (O - Efe + /x^^Zfc)|| 

A ^ 

( 11 ) 

E* =argminL^^(Afe+i,E, Zfe) 

E 

= argminA/i^:i||E||i + ^ ||E - (O - A^+i +/i^:iZfe)||^, 

( 12 ) 

where k indicates the iteration index (see Alg. 1). 


3.1.2 Why the partial sum of singular values ? 

A major advantage of using the PSSV over the nuclear 
norm is that it does not minimize the variance of data 
distribution within the target rank. Minimizing the nuclear 
norm can favor a solution that has a lower nuclear norm, 
but the singular values in residual ranks (singular values 
above the target rank N. Let N=1 here) can be still large 
as illustrated in Pig. l-(a) and Pig. 2-(a). This bias can 


3.3 Solving A* 

To minimize Eq. (11), we define the Partial Singular Value 
Thresholding (PSVT) operator P 7 v,r[']- Before defining the 
PSVT, we first introduce the von Neumann's lemma (see 
details in de Sa et al. [14]). 

Lemma 3 (von Neumann [14]). For any matrices B, Z G 
^mxn vectors of the singular values cr('), the following 
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equality holds: 

max{(UZV^,B)|UeW™,VGW„} = (a{Z),a{B)), (13) 

where Un denotes the set ofn x n unitary matrices, and (A, B) = 
Tr(A^B),/or any matrix A G Hence 

(A,B)<(a(A),a(B)). (14) 

Moreover, equality holds in Eq. (14) iff there exists a simultaneous 
SVD U and of A and B in the following form: 

A = Udiag (cr(A)) and B = Udiag (cr(B)) V^. (15) 

The von Neumann's lemma shows that (A, B) is always 
bounded by the inner product of o-(A) and o-(B). Notice 
that the maximum value of (A,B) can be only achieved 
when A has the same singular vector matrices U and V as 
B. This fact is useful to derive the PSVT. 

Theorem 1 (PSVT). Let r > 0, I = min(m,n) and X,Y G 
^mxn decomposed by SVD. Y can be considered 

as the sum of two matrices, Y = Yi + Y 2 = UyiDyiV/^^ + 
Uy 2 Dy 2 V/ 2 / ii^here Uyi, Vyi are the singular vector matrices 
corresponding to the N largest singular values by SVD, and 
Uy 2 ,My 2 from the (N-\-l)-th to the last singular values. Define 
a minimization problem for the PSSV as 

argmin i||X-Y||^ + r||X||p=jv. (16) 

X Z 

Then, the optimal solution of Eq. (16) can be expressed by the 
PSVT operator defined as: 


PiV.x[Y] =Ur(Dyi + Sr[nY2]Wi 

=Yi+Uy2>S,[Dy2]Vj2, ^ 

where 

Dyi = diag(cri, • • • , ctat, 0, • • • ,0), 

Dy 2 = diag(0, • • • ,0, ctat+i, • • • , cr/), 

and Sr[x] = sign{x) • max(|x| — r,0) is the soft-thresholding 
operator [171, [22]. 

Proof. Let's consider X = 

where Ux = [uiy" ^Um] E Um, Vx = bl,*" ,'^n] e Un 

and Dx = diag(cr(X)), where the singular values afr) = 
[cri('),"- > 0 are sorted in a non-increasing order. 

Also we define the function J(X) as the objective function 
of Eq. (16). The first term of Eq. (16) can be derived as 
follows: 

i||X-Y||^ = i (||Y||^ -2(X, Y) + IIXIII.) 

||Yf^ - 2 ^ aiiX)uJYvi + ^ a^iXf 

i=l i = l 

fr I|Y|If + ^ E (-2<Ti(X)ulYvi + <t,(X)2) . 

^ i=l 

In the minimization of Eq. (18) with respect to X, || Y||^ is 
regarded as a constant and thus can be ignored. Eor a more 




detailed representation, we change the parameterization of 
X to (Ux, Vx,Dx) and minimize the function: 


J(Ux,Vx,Dx)= (19) 

. i i 

-Y,{-2<Ti{X)ujYvi + ai{Xf)+T ^ ai{X). 

i=l i=V+l 

Erom von Neumann's lemma, the upper bound of uj YVi 
is given as cri(Y) = meLx{ujYvi} for all i when Ux = 
Uy and Vx = Vy. The lower envelope of J(Ux, Vx, Dx) 
is obtained at Ux = Uy and Vx = Vy. Then Eq. (19) 
becomes a function depending only on Dx as follows: 


J(Uy,Vy,Dx) (20) 

= ^E(-2^*(XK(Y) + ^*(X)2)+t E 

^ i=l i=Ar+l 

1 ( ^ 

\i=i 

+ (-2<7i(X)<Ti(Y)+<7i(X)2+2T<Ti(X)) j . 

i=NYl J 


Since Eq. (20) consists of simple quadratic equations for 
each cri(X) independently, it is trivial to show that the 
minimum of Eq. (20) is obtained at Dx = diag(d(X)) by 
derivative in a feasible domain as the first-order optimality 
condition, where di(X) is defined as 


f (Ti{Y), ifi<Y + l, 

I max (cr^(Y) — r, 0), otherwise. 


( 21 ) 


Hence, the solution of Eq. (16) is X* = UyDxVY- This 
result exactly corresponds to the PSVT operator where a 
feasible solution X* = Uy(Dyi + <Sr[Dy 2 ])V/ exists. □ 


Our proposed PSVT can be regarded as a special case of 
solving the weighted nuclear norm based objective function 
of Chen et al. [11] and Gaiffas et al. [20]. But we would 
like to notice that our method suggests how the weighted 
parameter (defined in their literatures) can be determined 
to encourage the rank constraint. Also, notice that our 
proposed PSVT operator provides a closed-form solution 
for systems of the same form as Eq. (16) (e.g. Eq. (11)). 
While Eq. (11) is a non-convex function, the PSVT provides 
a global optimal solution for the sub-problem of A (see the 
proof of Theorem 1). 

As an analysis of PSVT, when r = oo, the optimal 
solution of Eq. (16) is a low-dimensional projection of Y 
known as singular value projection [27] which enforces the 
target rank constraint through projection. When < r for 
I < i < N, conventional SVT [8] projects these to zero 
resulting in a more deficient rank of A than the target rank 
while our PSVT does not lead to rank deficient matrices. 
Hence, PSVT implicitly encourages the resulting matrix A 
to meet the target rank even when all the are small, 
which occasionally happens when the number of observed 
samples is limited. 
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3.4 Solving E* 

As suggested by Hale et al. [22], the solution to the sub¬ 
problem in Eq. (12) can be obtained as: 

5,[Y] = argmin hx - Y||^ + t||X||i, (22) 

X ^ 

where Sr[x] = sign(x)max(|x| — r, 0) is the soft-thresholding 
operator [17], [22], and X G M. This operator can be extended 
to vectors and matrices by applying it element-wise. The 
soft-thresholding (shrinkage) method is shown to be very 
effective in minimizing /^-norm and the proximity term, 
and guarantees that the solution is the global minimum for 
the equations of the same form as Eq. (22) (e.g. Eq. (12)) [17], 
[ 22 ]. 


3.5 Updating A* and E* 

At each iteration k, and E/^ can be updated with the 
operators Sr['] and IPAr,r['] as: 

A/c+i = -i[0 - E/e + /i/e“^Z/e], 

(23) 

E/e+l = [O - A/e+1 + /i/e“^Z/e]. 

The iterations are terminated when the equality constraint 
is satisfied (in all the experiments, < le“^). 

Experiments showed that updating A/e and E/e for only 
one iteration in the inner loop is sufficient to produce a 
satisfying accurate solution of Eq. (3). This method is called 
the inexact ALM [31] and is designed for computational 
efficiency. 

We summarize the overall algorithm in Alg. 1 (Eor more 
details, refer to the report of Lin et al. [31]). 


Algorithm 1 ADMM for the PSSV based RPCA 

Input : O G ^ , A > 0, the constraint rank N. 

Initialize Aq = Eq = 0, Z as suggested in [31], /jq > 0, p > 1 and 
k = 0. 

/ / Outer loop 
while not converged do 
/ / Inner loop 
while not converged do 

^jsj [t) Efc + ^Zfc]. 

Efc-|-i = 1 ak 

end while 

ak-\-i = pi^k- 
k = k-\- 1. 

end while 
Output: (A/e,E/e). 


3.6 Convergence Analysis 

To the best of our knowledge, the general convergence 
property of ADMM which alternates between non-convex 
(solving A*) and convex (solving E*) functions has not been 
answered yet. The ADMM for non-convex problems can be 
considered as a local optimization method, which aims to 
converge to a point with better objective value [4]. 

In our problem, each sub-problem has a closed-form 
solution and the objective value is always decreasing with 
respect to the primal variables optimized in each sub¬ 
problem iteration^. Our empirical convergence tests showed 

2. It does not mean a monotonic decrease of the Lagrangian function, 
which is not necessarily monotone due to the dual update. 



Fig. 4: Success ratio for synthetic data with a varying number of rows 
(dimension) m (a-d). Comparison between RPCA (nuclear norm) and ours 
(PSSV) for the rank-1 case (a,b), and for the rank-3 case (c,d). Y-axis 
represents the corruption ratio r g [0,0.4]. X-axis represents the log scale 
row size log^o^ G [log^o 100, log^Q 12800] in (a-d). The color magnitude 
represents the success ratio [0,1]. 

10 ° 

I -5 

I 


0.2 0.4 0.6 0.8 

Outlier Ratio 

Fig. 5: NRMSE comparison on a sufficient sample condition with a rank- 
3 matrix O G riooooxsooo under the sufficient sample case, the nuclear 
norm and PSSV solutions are very similar. 



that our ADMM based algorithm has a strong convergence 
behavior (see Sec. 4.1). Although the global optimal solution 
is not guaranteed, all of our experiments showed that our 
algorithm converges to a solution which is very close to 
the nuclear norm solution, when the number of observa¬ 
tions is more than sufficient. It also converges to a better 
solution than the nuclear norm solution when the number 
of observations is limited, even with all zero initializations. 

Besides the empirical behavior, we provide the conver¬ 
gence property for Alg. 1 in Proposition 1. It shows that 
any accumulation point (limit point) generated along the 
iterations satisfies the first-order necessary optimal condi¬ 
tion, a KKT (Karush-Kuhn-Tucker) point. 

Proposition 1 (Convergence). Let Sk = (A/^, E/., Y/., Y/.), 
where Y/,+i = Y/. + /i/c(0 - A/,+i - E/^) and {5'/e}^i is a 
set of intermediate solutions of Alg. 1. Suppose that {Y/.}^^ 
and {Y4r=i are bounded, lim (Y/^+i —Y^) = 0, and pk is 

k^oo 

non-decreasing, then any accumulation point of satisfies 

the following KKT conditions: (Cl) Y* G d^\\Af\p, (C2) Y* G 
d||AE*||i,(C3)0-A*-E* =0, (C4) d^||A*||pnd||AE*||i 7^0. 
In particular, whenever converges, it converges to a 

KKT point of Eq. (3). 

The proofs can be found in the supplementary material. 
The conditions for non-decreasing pk and the boundness of 
the sequence are already satisfied by Alg. 1 (see Lemma 1 
in Lin et al. [31]). Proposition 1 is established for a single 
iteration algorithm in the inner loop, i.e. iALM. When the 
inner loop iterates until convergence (exactly solving the 
inner loop), it may lead a simpler proof than the above re¬ 
sult. We remain further theoretical analyses of convergence 
as future work. 


4 Experiment Results 

We compare the performance of the proposed method 
against RPCA (nuclear norm) [9] with synthetic data sets 
and real world application examples. In all the experi¬ 
ments, we use the default parameters recommended by 
Candes et al. [9] for both their approach and ours, i.e. A = 
If x/max{m^ n) and p = 1.5, except if explicitly stated 
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Fig. 3: Success ratio for synthetic data with varying the number of columns (observations) n. Comparison between RPCA (nuclear norm) and ours (PSSV) 
for rank-1,2,3,5,10 cases. X-axis represents the column size, and Y-axis represents the corruption ratio r e [0,0.4]. The color magnitude represents the 
success ratio [0,1]. The white dotted lines are provided as a guide for easier comparison. 
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Fig. 6: Comparison for the rank deficiency of the estimated low-rank matrix 
A for the rank-3 case, obtained by RPCA (a) and our method (b). The red 
regions indicate rank deficiency, i.e. the rank of the recovered matrix is lower 
than the constraint rank. X-axis represents the column size, and Y-axis 
represents the corruption ratio r e [0,0.4]. The color magnitude represents 
the success ratio [0,1]. 

otherwise. The code of the proposed method is available 
on our project website 


4.1 Synthetic Dataset 

We compare our method (PSSV) with RPCA (nuclear norm) 
on synthetic data by evaluating the success ratio and 
convergence behaviors. To synthesize a ground-truth low- 
rank matrix Aqt ^ of rank-A, we perform a linear 

combination of N arbitrary orthogonal basis vector. The 
weight vector used to span each column vector of Act 
is randomly sampled from the uniform distribution f/[0,1]. 
To generate sparse outliers, we select mxnxr entries from 
AcTf where r denotes the corruption ratio. Larger r means 
more outlier entries. The selected entries are corrupted by 
random noise from f/[0,1]. We run each of the tests over 50 
trials and report the overall average errors of the trials. We 

refer to normalized root mean squared 

error (NRMSE). 

4.1.1 Comparison of Success Ratio 

We verify the robustness of RPCA (nuclear norm) and 
the proposed method (PSSV) with respect to the number 
of observations, data dimension and corruption ratio. We 
examine the performance by counting the number of suc¬ 
cesses. If the recovered A has a NRMSE smaller than 0.01, 
we consider the estimation of A and E is successful. We 
compare the success ratio with varying column size n (i.e. 
the number of observations), and row size m (i.e. data 
dimension). The magnitude in Eig. 3 indicates the success 

3. http://thoh.kaist.ac.kr 
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Fig. 7: Distribution of residual errors with 1000 different random initializa¬ 
tions. 

percentage. A larger blue area indicates a more robust 
performance of the algorithm. 

We also perform experiments where we fix m = 10,000 
and vary n and r. The comparison between RPCA and our 
method with rank-1,2,3,5 and -10 constraints is shown in 
Pig. 3. As n decreases (i.e. the number of observations de¬ 
creases), the success ratio of RPCA decreases more rapidly 
than our method. When more observations are available 
(over n = 25 in Pig. 3 and Pig. 5), both methods show a 
similar behavior. 

Pig. 4 shows the success ratio of RPCA and ours for the 
varying row m cases. We fix n = 16, and vary m and r. Our 
method can successfully recover A and E with input data 
contaminated up to 15% of severe corruption for the rank- 
1 case in Pig. 4-(b), and leads to more robust results than 
RPCA despite 5% higher corruption for the rank-3 case in 
Fig. 4-(d). 

4.1.2 Rank Deficiency 

We verify whether the recovered A obtained by RPCA 
and our method is rank deficient. Our objective function 
minimizes the rank of A up to the target rank. Thus, the 
result rank of A should not be lower than the target rank. 
In practice, rank deficiency is crucial for quality of the final 
solution in some applications (e.g. photometric stereo). We 
measure the ratio crAr(A)/cri(A) (similar to the inverse value 
of the condition number) for the rank-A constraint case. We 
only test for rank-A = 3 as a typical example of photometric 
stereo. If the ratio is lower than 0.01, we consider that the 
recovered matrix has a rank lower than A. In Pig. 6, the red 
regions mean that the rank of the recovered matrix is lower 
than the target rank. The experiments empirically validate 
that the rank obtained by our method is bounded for almost 
all of the regions, while RPCA has regions whose rank is 
lower than the target rank. This happens when observations 
do not support its true subspaces well. 

















Method 

Objective function 

Constraint 

Eriksson et al. [18] 

min IIO - UVlh 
u,v 

- 

Zheng et al. [48] 

min||0-UV||i+A||V||. 

U ' U = I 

LMaFit [40] 

min IIO — All 1 

A,U,V " " 

A = UV 

SVP [27] based 
RPCA 

min||E||i 

O = A + E, 

rank(A) = N 

RPCA [9] 

min||A|U+A||E||i 

O = A + E 

WNNM [11] 

based RPCA 

min ||A||w,=. + A||E||i 

O = A + E 

Our method 

min ||A||p= 7V + A||E||i 

O = A + E 


TABLE 1: Summary of the compared methods. 0,A,E g e 

l^mxA^ Y G II • ||w,* is the weighted nuclear norm and the weight 

coefficients in w is determined adaptively as suggested by Chen etal. [11]. 

4.1.3 Sensitivity to Initialization 

Since the proposed objective function is non-convex, the 
converged solution may be different according to the ini¬ 
tialization. To study the sensitivity of the optimization 
against the initialization, we conducted 1000 experiments 
with random initialization on a rank-3 matrix O G 
with 5% outliers. The distribution of NRMSE is shown in 
Fig. 7. While the convergence of non-convex problem to 
an optimum is hard to be guaranteed, most solutions are 
concentrically distributed in regions near the ground-truth 
solution with small errors. 

4.1.4 Comparisons with other low-rank approximations 
We provide additional comparisons with the singular value 
projection (SVP) [27] based and the weighted nuclear norm 
(WNNM) [11] based methods, and low-rank matrix approx¬ 
imation approaches by MR The formulations are summa¬ 
rized in Table 1. The SVP and WNNM are reformulated 
based on RPCA framework for fair comparison. MF meth¬ 
ods enforce the target rank N constraint of data matrix 
(O = UV) by factorizing it into a product of rank-A^ 
basis (U) and coefficient (V) as hard constraint. Among 
the existing MF based methods, we compare with the 
state-of-the-art methods of LMaFit [40], Zheng et al. [48] 
and Eriksson et al. [18], with the default recommended 
parameters. 

Since the method of Eriksson et al. can only handle small 
size examples, we perform separate experiments for small 
and large scales. We synthetically generate data matrices 
M^oxT rank-2 for small scale or with rank-3 

for large scale, and varying corruption ratio in [0.05,0.20]. 
NRMSE is displayed in Fig. 8-(a,b). 

Compared to our method, their approach also minimizes 
the nuclear norm in addition to the hard target rank 
constraint. As discussed previously, since minimizing the 
nuclear norm also implicitly minimizes the variance of 
the estimated low-rank matrix, their estimated low-rank 
matrix could be biased by this assumption. On the other 
hand, since our PSSV objective function does not have this 
assumption, and since the target rank is penalized softly, 
our method converges to more accurate solutions compared 
to the solutions of LMaFit, Zheng et al. and Eriksson et al. 

We have also conducted experiments for the under¬ 
sampled cases on subspace: e.g. a ground truth data is 
spanned with 3 basis axis (true and target rank are 3), 
but the distribution along the third basis axis has a very 
small variance (small singular value). Thus, although the 
underlying matrix is a rank-3 matrix, it is very close 




(a) Small scale, well sampled (b) Large scale, well sampled 



(c) Small scale, under sampled (d) Large scale, under sampled 


Fig. 8: Accuracy comparisons with varying outlier ratio and deficient number 
of samples for SVP [27] based and WNNM [11 ] based methods, LMaFit [40], 
Zheng et al. [48], Eriksson et al. [18], and our method. The experiments 
consist of small scale problems (O g rank-2) in (a,c) and large 

scale problems (O g m5ooox2o rank-3) in (b,d). The cases with well- 
sampled and under-sampled data on subspaces are shown at the top and 
bottom rows respectively. X-axis represents the percentage of outlier, and 
Y-axis represents the average error. LMaFit, Zheng etal. and Eriksson etal. 
are MF methods. MF methods also result in low accuracy under the case 
of deficient number of samples. Comparing (b) and (d), MF methods are 
prone to the data under-sampled on subspaces, because bilinear model 
enforcedly constrains the target rank and excessively attempts to match it. 



(a) Well sampled. (b) Under sampled. 

Fig. 9: Effects when the target rank is incorrectly set. We set the input target 
rank N = Ntrue + ^offset- where the truth rank Atme = 3. The lower value the 
better. 

to a rank-2 matrix. This situation often happens when 
the last basis is less supported by a few true samples. 
This is also called the unbalanced singular values case 
(e.g. cr(A) = [100,10, le“^]). Our results shown in Fig. 8- 
(c,d) have smaller errors than results from LMaFit, Zheng et 
al. and Eriksson et al., even for the under-sampled cases. 

4.1.5 Incorrect Setting of Target Rank 
Our method takes advantage of the target rank from the 
problem definition. When the target rank is set incorrectly, 
the question of the behavior of our method naturally arises. 
For the sake of completeness, we have experimented with 
incorrect target rank setting in Fig. 9. 

We considered the situation where the rank is known, 
but ambiguous within some bound (e.g., the truth rank is 3, 
but ambiguous within rank-{2,3,4}). The data construction 
is similar to the experiments conducted in Sec. 4.1.4, i.e. 
well-sample and under-sampled data cases. The rank-3 
matrices O G m 30 ooxioo used for experiment. Fig. 9 
shows that MF based methods are prone to incorrect target 
rank setting. Interestingly, for the data under-sampled on 
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(a) Stop criterion (b) Relative Error 


Fig. 10: Convergence behavior of RPCA [31] and our method for the rank 
2,3 and 4. 



Fig. 11: Evolution of the NRMSE according to varying A = L/^inm{m,n). 
For experiments, rank-3 matrices O g are generated with 5% 

outliers. 

subspaces, MF based methods with a target rank lower 
than the true rank show better performance than for the 
well-sampled data case. This is because the bilinear model 
enforcedly constrains the target rank within the bilinear 
matrix structure. Therefore, when the 3^^^ basis is weakly 
supported by samples, fitting with a rank-2 bilinear model 
only for the and 2^*^ basis provides better precision than 
using a rank-3 bilinear model. This result is consistent with 
Fig. 8-(b,d). 

4.1.6 Convergence Behavior 

To examine the convergence behavior of both RPCA [31] 
and our method, we plot the evolution of the relative errors 
I|Egt-e||^ termination criteria 
over the iterations in Fig. lO-(a) and (b), respectively. We 
randomly generate 5000 x 40 matrices for the rank-2,3,4 
cases, and the average value over the trials is computed. 

We use the MATLAB implementation of RPCA provided 
by Wright et al. [44]. We run our method until convergence, 
and we observe that it is terminated at similar moments 
with RPCA as shown in Fig. lO-(a). Also, our method takes 
the same amount of time as inexact ALM based RPCA [31]. 
Fig. lO-(b) also shows that our method provides higher 
accuracy than RPCA as well as a gradual convergence 
under the same termination criterion. 

4.1.7 Lambda (X) parameter 

We conduct all the experiments in this paper with the 
same A parameter recommended by Candes et al. [9]. For 
completeness, we show in this section how the choice of 
A can affect the solution of both RPCA and ours. Note 
that tuning the optimal A to balance the nuclear norm and 
sparsity is not possible unless the ground truth solution is 
known as discussed by Chandrasekaran et al. [10]. Thus, the 
results provided here are only for reference. Fig. 11 shows 
normalized MSE when A varies, where A = L/^/max{m,n). 
The results show that our method consistently produces 
less errors than RPCA under different settings of A. 




Exposure A t 


Fig. 12: Illustration of the observed intensity values for (a) saturation 
region, (b) moving object, and (c) consistent cases. Solid lines denote ideal 
relationship between intensity and exposure, and dots and dotted lines 
denote the observed intensities. 



(a) (b) (c) (d) (e) 


Fig. 13: Comparison of the low-rank matrix and sparse error results be¬ 
tween RPCA and ours on the Arch dataset [21]. (a) Three out of the five 
input multi-exposure image samples. Low-rank (b,cl) and sparse error (c,e) 
results, respectively obtained by RPCA (b,c) and the proposed approach 
(d,e). 

4.2 Real-world Applications 

4.2.1 High Dynamic Range (HDR) imaging 

We apply the proposed method for modeling a background 
scene and a ghost-free HDR composition. The input is a 
set of low dynamic range (LDR) images and the goal is 
to composite an HDR image using RPCA to reject out¬ 
liers, such as moving objects and saturations, in the LDR 
images. We assume that the differently exposed images 
li are aligned and the camera response function (CRF) is 
calibrated (or linear). Then, the captured images can be 
represented as li = hcRAti, where R denotes the sensor 
irradiance, Ati is the exposure time of the i-th image, and 
K is a positive scalar. We construct the observed intensity 
matrix O G = [vec(/i)| ••• |vec(/n)] by stacking the 

vectorized input images, where m and n are the number 
of pixels and images respectively. Since the intensities of 
the input images are linearly dependent, the ideal solution 
of this problem is rank-1. However, in practice, rank(O) is 
higher than 1 due to moving objects, saturation or other 
artifacts (illustrated in Fig. 12). We apply RPCA (nuclear 
norm) and our method (PSSV) to each color channel in¬ 
dependently, in order to separate artifacts and background 
scene. 

The Arch and Sculpture Garden datasets from Gallo et 
al. [21] are used for evaluation. The estimated backgrounds 
as low-rank matrix and the sparse outliers from RPCA 
and our method are shown in Fig. 13. The example in 
Fig. 13-(a) consists of only 5 input images which is very 
limited. Ideally, the decomposed low-rank matrix A = 
[vec(Ai)| • • • |vec(An)] consists of relative intensities of the 
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(d) (e) 

Fig. 14: HDR composition results for the Arch (top) and Sculpture Garden 
(bottom) datasets [21]. (a) Debevec et al. [15]. (b,d) RPCA [31]. (c,e) the 
proposed method. 



Frame 155 Frame 295 Frame 316 


Fig. 15: Motion detection comparison between RPCA and our method on 
a varying illumination dataset [34]. (a) Representative sampled inputs, (b-c) 
Results of RPCA, with n = 5 and 100 respectively, (d-e) Our results, with 
n = 5 and 100 respectively. 

background scene from which moving objects or satura¬ 
tion artifacts should be removed (see Fig. 13-(b,d)). RPCA 
returns a low-rank matrix whose magnitude differs drasti¬ 
cally from the input image, as shown in Fig. 13-(b). More¬ 
over RPCA yields a dense non-zero entries in E, instead 
of being sparse, as shown in Fig. 13-(c). This situation is 
similar to the example in Fig. 2 where the minimum nuclear 
norm favors a solution with smaller variance of magni¬ 
tudes. In contrast, our proposed method shows a correctly 
modeled background scene and successfully detects outlier 
regions, as shown in Fig. 13-(d,e). For displaying the sparse 
components in Fig. 13-(c,e), each color component (R,G,B) 
is set with (|Ei^|, lE^j, jE^I), where E{^ denotes sparse 
error matrix for each channel. 

After we estimate the low-rank matrix, we composite 
the HDR images using the standard method of Debevec et 
al. [15]. The final HDR results are shown in Fig. 14. Because 
the background modelling by RPCA is inaccurate, ghosting 
appears in their HDR results. In contrast, our results are 
ghost-free. 
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(Full rank) (Rank-3) Outliers 


Fig. 16: Photometric stereo illustration for the used model. 



(a) (b) (c) (d) (e) 

Fig. 17: Outlier rejection results for Photometric Stereo, (a) Three sampled 
input images out of five, (b-c) low-rank and sparse image from Wu etal. [45]. 
(d-e) low-rank and sparse image from ours. 

4.2.2 Motion Detection by Temporai Edge 

RPCA-based background modeling for surveillance pur¬ 
pose requires a large number of observations to estimate 
background and moving objects under global illumination 
changes. Such requirement is not suitable for online algo¬ 
rithm in surveillance. Using a few images as input, the mov¬ 
ing region detection by RPCA could fail due to the limited 
number of observations. In this problem, we observe that 
edge images make moving object boundaries more sparse 
and they rarely overlap. We stack a few n edge images 
(obtained by Sobel operator) in video sequence as column 
vectors of a matrix O G = [vec(Oi)| ••• |vec(On)]- 

Without moving objects, the edge pixels on the background 
texture are static, so the matrix O should be low-rank, 
essentially, rank-1. Since moving object regions are not con¬ 
sistent with background edges, the regions can be modeled 
as sparse outliers. 

Fig. 15 shows the comparisons with RPCA and the 
proposed method. RPCA fails to decompose low-rank and 
sparse matrix in Fig. 15-(b) due to deficient observations 
where n = 5. On the other hand, our method successfully 
estimates moving object boundary, and the results are sim¬ 
ilar to the one obtained with many observations in Fig. 15- 
(e). 



(a) LS (b) Wu etal. (c) Ours (d) LS (e) Wu et al. (f) Ours 
(n=5) (n=5) (n=5) (n=10) (n=10) (n=10) 


Fig. 18: Results of the photometric normal estimation and depth by LS 
method [42], Wu etal. [45] and our method. Top row: the normal estimation. 
Bottom row: the reconstructed depth from the estimated normals, n denotes 
the number of input images. 





















11 



cri 

0-2 

crs 

0-4 

cr5 

Input aiiOc) 

137.88 

23.59 

19.55 

15.56 

12.99 

RPCA <Ti(Ao) 
in Fig. 18-(b) 

125.71 

7.26 

0.0001 

0.00 

0.00 

Ours ai(AG) in 
Fig. 18-(c) 

139.16 

23.01 

16.33 

1.59 

0.15 


TABLE 2: Singular values of photometric stereo input for n = 5 in Fig. 17 
and Fig. 18-(b,c). 

4.2.3 Outlier Rejection for Photometric Stereo 

Intensity observation is modeled in Lambertian photomet¬ 
ric stereo as O = [vec(Oi)| ••• |vec(On)] = N^L^, where 
O G N G and L G denote measured in¬ 

tensity, normal and light direction matrix, respectively, and 
m and n are the number of pixels and images. Hay aka wa et 
al. [23] show that the intensity matrix lies in a subspace of 
rank 3, as illustrated in Fig. 16. However, this constraint 
is hardly satisfied in real situations due to shadow from 
self-occlusion, saturation and some object materials which 
do not exactly follow the Lambertian diffusion model. 
Considering the rank-3 constraint, the artifacts mentioned 
above can be regarded as sparse outliers and we get a low- 
rank structure asO = N^L + E. 

The robust photometric stereo with outlier rejection can 
be formulated as a RPCA problem as suggested by Wu et al. 
[45]. We compare our method with the standard least square 
(LS) method [42] and RPCA by Wu et al. [45]. Among them, 
Wu et al. and our method do not require light information 
(i.e. uncalibrated setting), while the LS method requires 
light calibration a priori. Thus, RPCA based outlier rejection 
is a more challenging problem than the robust regression 
given light information [26]. The LS based photometric 
stereo estimates the normals by minimizing ||0 — N^L|||.. 
We corrupt some input images by painted artifacts to mimic 
outliers. The corrupted inputs are included in 2 out of n = 5 
inputs (Fig. 17 and Fig. 18-(top)), and 4 out of n = 10 inputs 
(Fig. 18-(bottom)). Outlier rejection results are shown in 
Fig. 17. We present qualitative comparison of normal recov¬ 
ery results in Fig. 17 and Fig. 18. Wu et al. return a planar 
surface normal when the rank of input matrix is lower than 
3 due to the lack of observations (italic in Table 2). When 
more input images are available, RPCA begins to return 
detail preserved results, as shown in Fig. 18-(e). On the 
other hand, our method consistently provides robust results 
for both limited and sufficient observations, as shown in 
Fig. 18-(c,f). 

For quantitative results, we use the Bunny dataset [26] 
generated using the Cook-Torrance reflectance model and 
consisting of 40 different lighting conditions. The average 
ratio of specular and shadow regions in Bunny are 8.4% 
and 24% respectively, which act as outliers. Table 3 shows 
quantitative results. We vary the number of images and 
add 5% of uniformly distributed corruption. Each value in 
Table 3 is averaged over 20 randomly selected test sets. 
Wu et al. [45] produce degenerated results, as the rank 
of the resulting matrix is lower than 3 due to the lack of 
supports from the observations. When more input images 
are available, RPCA returns more satisfying results, but 
still the accuracy is lower than LS method. In contrast, 

4. Note that the intensities only on the object region are used in the 
observation matrix O with the corresponding object mask. 



Observations 0 Clean aligned Images A Errors E 
(full-rank) (rank-1) (sparse) 



(a) (b) (c) (d) (e) (f) (g) 


Fig. 20: Batch image alignment experiments, (a) Three input images, (b- 
d) The aligned, low-rank and sparse results by Peng et al. [38]. (e-g) The 
aligned, low-rank and sparse results by the proposed method. The images 
in the illustration on the top row are from AR dataset [32]. 

our method provides accurate results for both limited and 
sufficient observations. 

4.2.4 Batch Image Alignment 

Given several images of an object of interest (e.g. face), the 
batch image alignment task aims to align them to a fixed 
canonical template [6], [38]. In this problem, we search for 
a transformation gi for each image li to make the images 
linearly correlated. We note g the set of transformations: 
g = {gi,...,gn} where n is the number of images 

and write O o g = [vec(/i o g^)\ • • • |vec(/n o g^)]. Contrary 
to the formulation of Peng et al. [38], we consider PSSV 
mathematically formulated as follows: 

argmin||A||p=Ar + A||E||i, s.t. O o g = A + E. (24) 

A,E,g 

We applied our approach to the head dataset acquired 
under varying poses (see Fig. 20-(a)) [38]. For linearly 
correlated noise-free batch images, the rank is A = 1, 
when the transformations for exact image alignment are 
estimated. Our results of alignment, low-rank estimation 
and error sparsity are shown in Fig. 20-(e,f,g). Compared to 
the results obtained by RASL [38], our method can correctly 
detect the outliers (Fig. 20-(c) vs. Fig. 20-(f)), even with only 
3 input images. 

Our method can correctly detect the outliers and also 
robustly align the images even if the geometric model has 
more degrees of freedom than an affine homography model. 
Detailed comparisons in Fig. 22 show the average image ob¬ 
tained from the aligned image stack by each method. If well 
aligned, the average image should show seamless image 
without duplicated edges. Our results show fine average 
images due to more accurate homography estimation than 
RASL. 

4.2.5 Image Recovery 

Images of natural scenes follow natural statistics [25]. As 
shown by Hu et al. [24], information of image scenes is 
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Mean error (in degrees) 

Max error (in degrees) 

Standard deviation 

No. Image 

LS [42] 

Wu et al [45] 

Ours 

LS [42] 

Wu et al [45] 

Ours 

LS [42] 

Wu et al [45] 

Ours 

5 

8.53 

27.88 

7.06 

159.72 

130.77 

120.78 

14.48 

16.45 

12.30 

8 

9.03 

13.34 

5.87 

142.45 

139.07 

85.48 

11.24 

10.96 

9.62 

10 

9.24 

11.14 

5.70 

148.05 

110.12 

79.54 

9.91 

9.77 

8.06 

12 

8.96 

9.95 

5.09 

130.04 

80.21 

76.86 

9.17 

9.02 

7.59 


TABLE 3: Photometric stereo results of Bunny with 5% corruption ratio, additional specularities and shadows. Bold fonts indicate highest accuracy. 



(a) Input (b) LS (c)Wuefa/. (d) Ours (e) LS (f)Wuefa/. (g) Ours 


Fig. 19: Photometric stereo results from 5 (top) and 12 (bottom) images of Bunny dataset with corruption, (a) A representative input image, (b-d) Recovered 
surface normals by LS [42], Wu etal. [45] and ours, (e-g) Corresponding error maps for each algorithm. 
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Fig. 21: Aligning planar surfaces despite occlusions by RASL [38] and 
ours with n = 4 images. Affine transformation is used as the geometric 
transformation model g. 
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Fig. 22: Close-up comparisons of Fig. 21 (see red box in the left) with the 
average images of aligned results Oog and recovered low-rank components 
A between RASL [38] and ours. 


dominated by the top 20 singular values, which is low- 
rank. Hu et al. [24] proposed a matrix completion method 
with the truncated nuclear norm (TNN) as introduced in 
Sec. 2. We formulate the matrix completion as 

argmin||A||p, s.t. A = B, Pn(B) = Pn(0), (25) 

A,B 

where Vn{') is the orthogonal projection operator setting 
[Vn{^)]i,j = for G ft and 0 otherwise. Although 
the auxiliary variable B looks unnecessary, with the affine 
constraint (the projection operator) it makes the efficient 
PSVT operator applicable in ADMM algorithm. We set p = 
1.05 in ADMM (see the supplementary material and Alg. 
2), and for fairness, we follow the same setting as suggested 
by Hu et al, i.e. po = le“^. 

Fig. 23 shows the comparison with Hu et al. Since the 
proposed method has a similar objective function with 
Hu et al., PSNR of recovered images are similar (the max. 
difference is 0.134), but our method requires fewer itera¬ 
tions and runs 4 times faster. Because Hu et al. perform local 
approximation to solve PSSV, their algorithm requires outer 



Index 1 2 3 4 5 6 7 8 


PSNR 


[24] 

Ours 


30.428 

30.429 


21.741 

21.768 


24.506 

24.507 


32.282 

32.216 


24.375 

24.376 


42.145 33.357 21.293 
42.011 33.282 21.325 


Iter. 


[24] 

Ours 


727 

210 


639 

192 


678 

181 


1251 

216 


799 

201 


Time 

(sec) 


[24] 

Ours 


18.60 

4.06 


18.36 

4.30 


15.49 

4.48 


18.77 

4.20 


15.87 

4.09 


16.59 

3.94 


20.84 

3.07 


19.35 

4.47 


Fig. 23: Image recovery application. Top: Original images (ground truth). 
Middle: Observed images with 50% missing entries (input). Bottom: Recov¬ 
ered image by our method. Table: Ouantitative comparison with Flu et al. 
[24] for PSNR, number of iterations, and running time. 


loop to fix subspaces and inner loop to minimize nuclear 
norm with affine constraint. In contrast, our method directly 
minimizes PSSV, which is a key contribution of our work. 


5 Discussions and Conclusion 

In this paper, we revisited the rank minimization method in 
RPC A for low-level vision problems. When the target rank 
is known, we show that, by modifying the objective func¬ 
tion from the nuclear norm to PSSV, we can achieve a better 
control of the target rank of the low-rank solution, even 
when the number of observations is limited. The appealing 
advantage of our solution is that it can be easily utilized 
in existing algorithms, e.g. ADMM [31], and the efficient 
computation properties still hold. The generality and the 
effectiveness of our approach are supported through nu¬ 
merous and extensive experiments on both synthetic exam¬ 
ples and several real-world applications which outperform 
the conventional nuclear norm objective function. We do 
not consider scalability issues of our method in this paper, 
but the recent approach suggested by Oh et al. [37] allow 
to speed-up the application of our method. An interesting 
direction of future work is the mathematical analysis of the 
properties of our partial sum objective function compared 
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to the nuclear norm solution. In the following, we discuss 
some open questions related to our paper. 

Sufficient number of samples versus minimum number 
of samples In our experimental analysis, we found that 
our solution is more robust than the nuclear norm solution 
when facing a limited number of samples. Defining K as the 
(theoretical) minimum number of samples for processing, 
e.g. 2 images for HDR, 3 images for photometric stereo, our 
approach requires more than K samples for a robust model 
estimation and outlier rejection. We believe that the number 
of needed additional samples depends on the problem 
setting, e.g. the shape of feature space or the distribution 
of the samples. 

Target rank While our formulation implicitly encourages a 
target rank constraint in the resulting matrix, this constraint 
is hardly enforced. We discuss here two possible scenarios 
that can produce the resulting matrix having a rank dif¬ 
ferent from the target rank. The first scenario is when a 
very limited number of samples are observed. In such case, 
PSVT can produce a deficient rank lower than the target 
rank when the span of the observed samples is less than 
the target rank, but this case is a fundamental limitation 
of under-sampling rather than a conceptual limitation of 
our approach. Another scenario is due to too much noise 
(especially for Gaussian noise that does not follow the 
sparsity property) in the observed samples which results 
in large singular values in the residual ranks. In this case, 
a solution to satisfy the rank constraint is to increase r in 
Eq. (17). When r is equal to infinity, our PSVT solution 
is close to the result using singular value projection [27]. 
However, the projection method enforcing target rank could 
produce an over-fitting solution due to the mentioned noise 
effects. 
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Supplementary Material 

This is the supplementary material for the main paper [5]. In this supplementary material, we prove Proposition 
1, and we provide the pseudo code of the algorithm of image recovery application. We also present an additional 
experimental result not included in the main paper due to space limitation. All the parameters are the same as in 
the main paper or the referred papers, except if stated otherwise. 

1 Proofs of Proposition 1. 

Lemma 4 (Lipschitz continuous of PSSV). The function of the partial sum of singular values h{X.) = ||X||p = cr^(X) 

(where p e N denotes the target rank) for X G is Lipschitz continuous. Namely, there exists a constant scalar K 

satisfying 

|/i(Xi) - /i(X 2 )| < K . ||Xi - X 2 IIF for all Xi,X 2 G 

Proof Let the nuclear norm be /(X) = ||X||* = cri(X), and the Ky-Fan p-norm be ^(X) = \\'X\\xy{p) = 

o'i(X). By definition, h(X) = ||X||p = f (X)—g(X), and we loiow that the nuclear norm /(•) [4] and the Ky-Fan 
p-norm ^(•) [6] are Lipschitz continuous (derivation of the Lipschitz continuity for Ky-Fan matrix norm is straight 
forward). Therefore we have 

|/(Xi)-/(X2)|<i^/.||Xi-X2||i., 

|^(Xi)-^(X2)| <i^,-||Xi-X2||F, (1) 

where Kf and Kg are Lipschitz constants for / and g respectively. 

We see that 

\h{X,) - h{X 2 )\ =|/(Xi) - ^(Xi) - (/(X 2 ) - ^(X 2 ))| 

= |/(Xi)-/(X2) + (^(X2)-^(Xi))| 

<|/(Xi) - /(X 2 )| + l^(Xi) - ^(X 2 )| (by triangle inequality) 

<(A; + A,).||Xi-X2||i.. 

Since the constant K = Kf ^ Kg satisfies the inequality, h{X) = ||X||p is Lipschitz continuous. □ 

Since PSSV || • ||p is a non-convex function, the typical subdifferential for convex functions (a.k.a. Fenchel-Moreau 
subdifferential for convex functions. Refer to Daniilidis et al. [3]) would be the empty set. Therefore, we introduce 
a generalized subdifferential (a.k.a. Clarke subdifferential, see Definition 3.2 in [1]) for non-convex locally Lipschitz 
continuous functions. This is useful for deriving stationary points in the convergence proof. 

Definition 2 (Generalized subgradients). Let f : ^ R be a locally Lipschitz continuous function at a point x G R'^. 

Then the subdifferential of f at is the set dcfi'x.) of vectors z G such that 

dcf{^) = {z : f°(x; d) > (z, d) for all d G , (2) 

where each vector z G dcfi'x.) is called a subgradient of f at x, and the directional subgradient of f at x in the direction 
vector d G is defined as /°(x; d) = limsup ^ 

y^x 

t4,0 
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Remark D.2.1. Definition 2 can he generalized to matrix cases analogously. 

Remark D.2.2. Regardless of non-convexity or non-smoothness, the generalized subdifferential (here, Clarke subdifferential) 
always exists for locally Lipschitz continuous functions. Also, dc\\ • \\p is well defined in because || • \\p is a Lipschitz 

continuous function as shown in Lemma 4. 

The following Lemma is also important for the convergence proof of Proposition 1. 

Lemma 5 (Convexity and compactness of subdifferential [1]) Let f : R’^xn ^ r ^ locally Lipschitz continuous 
function at X. Then the subdifferential dcfOQ is a non-empty, convex and compact set. 

Remark L.5.1. Basic properties of the generalized subdifferential are identical to those in convex case, and most of subdifferential 
calculus rules hold. Note that Karush-Kuhn-Tucker (KKT) optimality conditions are also properly defined with the 
generalized subdifferential [1]. 

Since our problem in Eq. (3, Main) does not have any inequality constraint, but an equality constraint, the 
KKT conditions are reduced to stationary and primal feasibility conditions. Armed with the above lemmas and 
definitions, we can now propose and prove the convergence of Alg. 1. 

Proposition 1 (Convergence). Let Sk = (A/^, E/^, Y/,, Y/e)/ ^here Y/^+i = Y/, + pk{0 - A/^+i - E/^) and is a 

set of intermediate solutions of Alg. 1. Suppose that {Y/c}^i and bounded, lim (Y/^+i — Yp) = 0, and pk is 

k^oo 

non-decreasing, then any accumulation point of {Sk}^=i satisfies the following KKT conditions: 

(Cl) Y* G dc||A*||^, (C2) Y* G d||AE*||i, (C3) O - A* - E* = 0, (C4) dc\\A^\\p H d||AE*||i ^ 0, (3) 

where Y*, A* and E* represent each cluster points. In particular, whenever converges, it converges to a KKT point 

of Eq. (2, Main). 

Proof. For Y, we have pp^(Yk+i — ^k) = O — A/^+i — E/^+i. Since lim (Y/^+i — Yp) = 0 and pk is non-decreasing, 

k^oo 

we have O — A/^+i — E/^+i = pp^{Yk-\-i — Yp) 0, which satisfies (C3). 

Since E/^+i obtained by the soft-thresholding operator [2] minimizes (A/^+i,E, Y/^) (refer to Eq. (10, Main)) 
by definition, we have 

0 Gd||AE/c+i111 — Yp — pk{0 — A/c+i — E/i;+i) 

=d||AE,+i||i-Y,+i 

(since Y/^+i = Yp + Pk{0 — A/^+i — E/^+i)) 

^Y/e+i G d||AE/c+i111, 

which satisfies (C2). 

Since A/^+i optimally obtained by the PSVT minimizes (A, E/^, Y/^) by Theorem 1 and Lemma 5, we have 

0 Gdc||A/c+i||p — Yp— Pk{0 — A/e+i - E/e) 

—d(7 II A/e_|_i lip Y/e Pp(C) A/e_|_i E/e_|_i) ///e(E/e_|_i E/^) 

(o) 

=dc||A/e+i||p - Y/e +1 - ///c(E/e+i - E/e) (by definition of Y/e+i) 

^Y/e-|_i + /i/e(E/e-|_i E/e) G d(7 || A/e-|_i ||p. 

Since {Y/e}^^ and {Y/e}^^ are bounded, there must exist a scalar c > 0 such that ||Y/e+i||i7 < c and ||Y/e+i||F < c. 
Then, 

Y/e-|-l Y/e-|-i = Pk (/^/e (Y/e-|-i Y/e) (O A/e-|-i E/e)^ 

= Pk ((O — A/e+i — E/e+i) — (O — A/e+i — E/e)) = Pk{E^k “ E/e+i) 

^ ||E,+i - E.IIf = %'||Y,+i - Y,+i||^ (6) 

<Pp^{\\Yk^i\\F + ||Y/e+i||F) (by triangle inequality) 

< 2cpp^ 0 (since pk is non-decreasing). 

Thus, Y/e+i ^ Y* G dc||A*||p from Eq. (5) and Y* G dc||A*||p fi d||AE*||i / 0 which satisfy (Cl) and (C4). The 
sequence {Sk}^=i gradually satisfies the KKT conditions, which completes the proof. □ 

Remark P.1.1. In Alg. I (and Proposition I), the assumption for non-decreasing pk is always satisfied by the update rule 
PkFi = PTk (with p > 1), and the boundness of the sequences and {Y/e}^;^ i^ satisfied by the below Lemmas 6 
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and 7 in conjunction with a similar manner of Lemma 1 in Lin et al. [4]. Therefore, we see that Alg. 1 converges as long as 
Y/c converges. 


Lemma 6 (Boundness of |/°('; •)! [1]) ^Rbe a locally Lipschitz continuous function at X, and d G 

Then, there exists a scalar B such that 

|/°(X;d)| <B||d||f. (7) 


Proof Refer to the proof of Theorem 3.1 in [1]. □ 


Lemma 7 (Boundness of Clarke subgradient) Let f : be a locally Lipschitz continuous function at X. Then, 

a subgradient W G dcf(X) is bounded as 

l|W||^ < B, (8) 

where B is the same constant as in Lemma 6. 

Proof By the definition of the Clarke subdifferential, W satisfies /°(X;d) > (W,d) for all d G By setting 

d = W, we get /°(X; W) > (W, W) = ||W|||.. Then, by Lemma 6, we have 

IIWIII < |r(X; W)| < i?||W||^ IIWIII < BIIWII^ 

^ IIWIliT < B (9) 

□ 


2 Algorithm for Image Recovery 


Algorithm 2 ADMM for Image Recovery 

Input : O G the index map Q, the constraint rank N. 

Initialize Ao = O, Bo = Zo = 0, /xo > 0, p > 1 and k = 0. 
while not converged do 

Afe + I = ~ Tk '^k\- 

Bfe+1 = argmin ||B - (A^+i + 

Vn{^)=Vn{0) 

Zfe+i = Zfe + /xfe(Afe+i — Bfe+i). 

Uk+i — PUk- 

k — k -\-l. 

end while 
Output : Afe. 
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